WAIS  promises  easy  text  retrieval 


Prototype  links  Mac, 
Connection  Machine 


By  Henry  Norr 

Cupertino, Calif. — Thinking
Machines  Corp.,  a  pioneer  in  the 
development  of  high-powered 
parallel-processing  supercomput- 
ers, has  joined  with  Apple,  Dow 
Jones  &  Co.  and  KPMG  Peat  Mar- 
wick  to  develop  a  new  technology 
designed  to  simplify  the  retrieval  of 
textual  information  stored  in  per- 
sonal files,  corporate  records  and 
remote  databases. 

Called the Wide Area Information Server (WAIS) project, the collaborative
venture has been under way for almost two years. Peat Marwick recently
completed a four-month experiment with the system, using WAIStation, a
prototype Mac front end developed by Thinking Machines of Cambridge, Mass.
Engineers from Apple's Advanced Technology Group have combined the WAIS
technology with a custom interface to build a prototype personal electronic
newspaper.

The  WAIS  project  was  designed 
in  part  to  address  problems  caused 
by  the  proliferation  of  electronic 
data  within  large  organizations. 

Brewster Kahle, WAIS project leader, helped develop an experimental
text-retrieval system that can use a Thinking Machines supercomputer as a
server and the Mac as a front end.

"Corporations are starting to gag on gigabytes of word processing files,
memos, reports, articles and E-mail archives," said Brewster Kahle, WAIS
project leader for Thinking Machines. "Corporate memory is stored in this
form, but executives have no easy way to get at it."

But the WAIS project was intended from the beginning, Kahle said, to be
more than a traditional executive information system working only within
corporate bounds. The objective was to lay the foundations for a scalable
system that would allow users to tap a variety of data sources, including
large commercial databases, through a uniform interface. Users, according
to the plan, should be able to search for any available information without
having to master the internal organization and query techniques of each
source.
See  Thinking  Machines,  Page  24 


Peat  Marwick  tries  'partner-friendly'  system 


When Thinking Machines Corp., Dow Jones & Co. and Apple first broached the
concept of the Wide Area Information Server (WAIS) with KPMG Peat Marwick,
representatives of the accounting giant were intrigued but cautious,
according to Brewster Kahle, project leader for Thinking Machines.

They  weren't  interested,  he  said, 
in  another  complex  querying  appli- 
cation that  busy  tax  consultants, 
accountants  and  managers  would 
never  bother  to  use.  But  they 
agreed  to  participate  in  the  project, 
according  to  Kahle,  on  the  promise 
of  a  system  that  would  be  genuine- 
ly "partner-friendly,"  with  "no  alge- 
bra —  no  ifs,  ands  or  buts." 

After  a  year  of  preliminary 
work,  an  experimental  WAIS 
R&D  project  went  on-line  at  Peat 
Marwick  last  October.  About  10 
users  at  the  company's  Montvale, 
N.J.,  headquarters,  including 
"very  senior  partners,"  took  part  in 
the  experiment,  along  with  two 
others  in  Manhattan  and  10  more 
on  the  West  Coast,  according  to 
Robin Palmer, senior manager and WAIS project leader at KPMG Peat Marwick
in San Jose, Calif. The remote users were connected by leased lines to a
WAIS server running on a Connection Machine, a Thinking Machines
parallel-processing system, installed in Montvale.

The  Peat  Marwick  experiment 
relied  on  WAIStation,  a  Mac- 
based  client  software  program 
developed  by  Thinking  Machines, 
as  a  front  end.  To  prepare  a  query, 
users  need  only  enter  the  subject 
they  are  interested  in,  in  English 
—  "IBM  and  Motorola,"  for 
instance,  or  "recent  developments 
in  personal  computers"  —  in  a 
text  field  labeled  "Look  for  docu- 
ments about."  They  then  drag 
icons representing possible
sources,  local  or  remote,  into 
another  field. 

When  the  query  is  run,  the 
Macintosh-based  front  end 
encodes  the  search  string  accord- 
ing to  the  WAIS  protocol  and 
passes  it  to  the  specified  servers. 
Each  server  translates  the  query 
into  its  own  language,  locates 
matching articles and returns the results to the front end.

The WAIStation application then displays headlines for each article; the
citations are ranked according to probable relevance, based on algorithms
that consider the position, frequency and proximity of desired terms within
the text.
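
The article does not give the ranking algorithms themselves. As a rough illustration only, the sketch below scores each article on the three signals named above (frequency, position and proximity of the query terms) and sorts headlines by that score. The function names, the stop-word list and the equal weighting are all assumptions made for this example, not details of WAIS.

```python
import re

# Hypothetical stop-word list so that words like "and" do not count as query terms.
STOP_WORDS = {"a", "an", "and", "in", "of", "or", "the"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def relevance(query, document):
    """Toy score: frequency + position + proximity (weights are arbitrary)."""
    terms = set(tokenize(query)) - STOP_WORDS
    words = tokenize(document)
    hits = [i for i, word in enumerate(words) if word in terms]
    if not hits:
        return 0.0
    # Frequency: share of the article's words that are query terms.
    frequency = len(hits) / len(words)
    # Position: a first match near the start of the text scores higher.
    position = 1.0 - hits[0] / len(words)
    # Proximity: reward the closest pair of different query terms.
    proximity = 0.0
    for a, b in zip(hits, hits[1:]):
        if words[a] != words[b]:
            proximity = max(proximity, 1.0 / (b - a))
    return frequency + position + proximity

def rank(query, articles):
    """Return (score, headline) pairs, best first, omitting non-matches."""
    scored = [(relevance(query, body), headline)
              for headline, body in articles.items()]
    return sorted((pair for pair in scored if pair[0] > 0), reverse=True)
```

Calling `rank("IBM and Motorola", articles)` on a dictionary that maps headlines to article text would place an article mentioning both companies close together near its opening ahead of one that mentions only Motorola in passing.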

By  double-clicking  on  the  head- 
line, users  can  get  the  full  text  of 
any  of  the  articles.  And  if  the  user 
drags the most useful titles into a
bin  labeled  "Similar  to"  and 
reruns  the  search,  the  system  will 
track  down  additional  articles  that 
share  a  large  number  of  words 
with  those  selected. 
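
A minimal sketch of that kind of word-overlap search follows, assuming articles are plain strings keyed by headline. The helper names and the simple count of shared distinct words are illustrative choices; the article does not say how the system measures overlap.

```python
import re

def word_set(text):
    """Return the set of distinct lowercase words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def similar_to(selected, candidates, limit=10):
    """Rank candidate articles by how many distinct words they share
    with the articles the user selected (hypothetical helper)."""
    wanted = set()
    for text in selected:
        wanted |= word_set(text)
    scored = []
    for headline, body in candidates.items():
        shared = len(wanted & word_set(body))
        if shared:
            scored.append((shared, headline))
    # Most shared words first; ties broken alphabetically by headline.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [headline for _, headline in scored[:limit]]
```

A production system would presumably discount very common words; this sketch counts every word equally to keep the idea visible.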

Peat  Marwick  completed  its 
WAIStation  testing  in  February. 
In  part  because  the  cost  of  main- 
taining a  real-time  wide-area  link 
among its many offices would be
"substantial,"  according  to  Palmer, 
the  company  has  not  made  a  com- 
mitment to  the  system  and  is  still 
considering  a  variety  of  alterna- 
tives. But,  he  said,  "we  are  still 
extremely  interested  in  the  WAIS 
concepts.  It's  a  most  promising 
technology."  —  By  Henry  Norr 


Thinking Machines, from Page 22

The  WAIS  system  has  three  components: 
>  Server software. Any information source capable of locating and
presenting text in response to a request in WAIS format can function as a
server: the source can be on the user's own machine, on a LAN or at a remote
site connected by modem. The WAIS client software can keep track of multiple
servers, search any or all in response to a single request and consolidate
the results.

Thinking Machines now includes the WAIS text-indexing and retrieval
software free with its Connection Machines, a line of massively parallel
systems that range in price from $100,000 to $5 million, according to
Kahle. In addition, the companies participating in the project developed a
sample server that runs on standard Unix systems. But any text-retrieval
program on any platform, including the Mac, could be adapted to function as
a WAIS server.

>  Protocol. To foster the development of WAIS-compatible data sources, the
four companies created an open protocol for transmitting queries and
responses. It is based on an existing standard, the National Information
Standards Organization's Z39.50 protocol, but is enhanced in several ways,
such as by the addition of support for audio and video information.

>  Clients. WAIS was designed to support a variety of interfaces running on
various platforms and tailored to different niches.

The system does not rely on a specialized query language; the front end
simply passes English-language search strings entered by the user to the
server.
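
The division of labor among the three components can be pictured with a short sketch. Here each server is modeled as a plain function that accepts the user's English-language string and returns scored headlines; the client sends the same string to every chosen source and merges what comes back. The type names, the `search` function and the assumption that scores from different servers are directly comparable are all hypothetical, and nothing here reflects the actual WAIS wire format.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Result = Tuple[float, str]                # (relevance score, headline)
Server = Callable[[str], List[Result]]    # hypothetical: takes the plain-text query

def search(query: str,
           servers: Dict[str, Server],
           chosen: Iterable[str]) -> List[Tuple[float, str, str]]:
    """Send one query to each chosen server and consolidate the results.

    Returns (score, server name, headline) tuples, highest score first.
    Assumes scores from different servers are on a comparable scale.
    """
    merged = []
    for name in chosen:
        # The same unmodified search string goes to every source.
        for score, headline in servers[name](query):
            merged.append((score, name, headline))
    merged.sort(key=lambda result: -result[0])
    return merged
```

Because a server is just a callable in this sketch, a local file search and a remote database would look identical to the client, which is the uniformity the project is aiming for.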

In addition to the prototype WAIStation interface and Apple's experimental
personal newspaper, front ends already are available for the X Window
System and GNU emacs, an extensible text editor that runs under a freely
distributed Unix-like operating system developed at the Massachusetts
Institute of Technology in Cambridge.

To promote the WAIS concept, Thinking Machines is making source code for
the system available over the Internet or by mail. The code comes free of
charge but without support. Using the software, programmers at MIT and
elsewhere already have created more than 20 WAIS servers, including a
poetry server, a weather server and a catalog of government programs.
Thinking Machines will maintain a publicly accessible directory of servers,
which will include descriptions of all known servers and special files that
allow WAIS front ends to plug into them.


